Abstract. A new approach is proposed to the quantitative estimation of the complexity
of multidimensional discrete sequences in terms of the shapes of their trajectories
in the extended space of states. This approach is based on the study of the structural
properties of sequences and is suitable for estimating the complexity of both chaotic
and stochastic sequences. It builds on the method of symbolic CTQ-analysis of
multidimensional discrete sequences and mappings, proposed earlier by the author.

The proposed algorithm takes into account not only the occurrence frequency of symbols
but also the order in which they appear. An example (financial time series) is
given that demonstrates the application of the tools developed.

Keywords: Discrete-time systems, Time-series analysis, Stochastic complexity, Estimation

1. Introduction


The notion of the "complexity" of an object is one of its most important
structural-informational characteristics and belongs to the class of fundamental scientific
concepts [23]. The narrower notion of the "complexity of a dynamic process" is no exception.
This notion is related to the predictability and information capacity of processes. The
complexity of a dynamic process is one of the criteria for classifying processes into
deterministic, chaotic, and stochastic ones [14]. Nevertheless, the questions of how to
define and calculate the complexity of dynamic processes remain methodologically open.

A quantitative approach to the notion of complexity was first formulated in the
statistical physics of equilibrium systems in 1877 by Ludwig Boltzmann, who introduced the
concept of "entropy", $H = k_B \ln W$, where $W$ is the number of microstates of a system
that can be realized in the given macroscopic state and $k_B$ is the Boltzmann constant.
R. Hartley extended the principles of statistical physics to the description of the states
of macrosystems and gave entropy an informational meaning [10]. This idea was further
developed in the works of C. Shannon on information theory [24]. In these works, Shannon
also introduced the notion of entropy:

$$H = -\sum_{i} p(x_i) \ln p(x_i), \tag{1.1}$$

where $p(x_i)$ is the probability distribution of independent random events $x_i$. Shannon's
entropy was generalized to dynamical systems by A. N. Kolmogorov and Ya. G. Sinai in their
entropy theory of dynamical systems.

The development of nonlinear dynamics, the theory of chaotic dynamical systems, and
the theory of non-equilibrium systems required the introduction of appropriate
characteristics, such as Lyapunov exponents, the Kolmogorov entropy, and Klimontovich's
S-parameter [11]. It is noteworthy that these parameters are also inherently linked to
Shannon's entropy.

However, all widely used modifications of the Boltzmann-Shannon entropy measure
have features that limit their applicability. Therefore, Rényi and Tsallis
proposed new formalisms in addition to the Boltzmann-Shannon entropy.

In the 1960s, A. N. Kolmogorov proposed a fundamentally new, algorithmic approach
to the interpretation of the concept of complexity [12]. He formalized the criterion
in the language of the theory of algorithms and constructed an appropriate measure for it
that has indubitable informational advantages. However, this measure is very difficult to
apply to estimating the complexity of dynamic processes, because both the computation
involved and the interpretation of the results are very laborious.


Darkhovskii et al. proposed an original approach to calculating the complexity of
a scalar dynamic process. The approach is based on the idea of the information expenditure
needed to approximate a process to a required degree of accuracy. It is conceptually
similar to the algorithmic approach of A. N. Kolmogorov. Its limitation is that the
choice of the approximating basis is arbitrary and is not substantiated.

In radio physics, the time-frequency criterion of complexity is widely used.
Here the measure is given by the product of the spectral width and the duration of a
dynamic process:

$$\Delta t\, \Delta\omega = 4 \left[ \int_{-\infty}^{+\infty} t^2 x^2(t)\, dt \int_{-\infty}^{+\infty} \omega^2 |S(\omega)|^2\, d\omega \right]^{1/2}, \qquad S(\omega) = A \int_{-\infty}^{+\infty} x(t)\, e^{-i\omega t}\, dt. \tag{1.2}$$


This criterion does not take into account the shape of the spectrum and operates with
the effective values of the spectral width and the duration of a dynamic process, which
makes the evaluation of the complexity rather conditional. Moreover, measure (1.2) imposes
constraints on the minimum decay rate of the functions $x(t)$ and $|S(\omega)|$ and has an
energy rather than informational meaning.

Recently, V. I. Arnold suggested an approach to the calculation of the complexity
of lattice sequences of the form $\mathbb{Z}_2 \times \mathbb{Z}$ (sequences of 0 and 1).
The method is based on the formalization of the structure of sequences: first, a
self-mapping for sequences is constructed (via cyclic difference), and then this mapping is
represented as a graph; the complexity of the original sequence is determined in terms of
the characteristics of this graph. A strong limitation of this method is that the
complexity measure constructed cannot be transferred to continuum-valued processes.


We should also mention the so-called perimetric complexity of binary images. In
this case, an image can be treated as a two-dimensional scalar field. The strongest
limitation of the method is that it can be applied only to binary images (from the class
$\mathbb{Z}_2 \times \mathbb{Z}^2$).

In this article, we propose a different approach to the analysis of the complexity of
chaotic sequences that is based on the study of the structural properties of the sequences
in terms of the shape of their trajectories. This approach is free from most of the
disadvantages of the above-mentioned approaches. It is based on the method of symbolic
CTQ-analysis [18,19], which is aimed at the study of multidimensional discrete sequences
and mappings. The formalism of CTQ-analysis studies the properties of dynamical systems
that are important from the viewpoint of identification and control of the systems and
prediction of their evolution.


The results were first presented by the author at the XXV IUPAP Conference on
Computational Physics. The initially proposed complexity measures dealt only with the
occurrence frequencies of symbols and ignored the order of symbols. In the present paper,
we remove this restriction, thus expanding the analytical capabilities of the approach
for estimating the complexity of discrete sequences.


Moreover, we substantially revise the principles of symbolic CTQ-analysis: we
formulate encoding rules for the symbols of the base alphabet in a rigorous and formal
manner, which allows us to form a complete and self-consistent set of symbols.

All calculations and visualizations are performed using Wolfram Mathematica 9. 


2. Symbolic CTQ-analysis 


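Consider a discrete dynamical system in the form of a mapping

$$\mathbf{s}_{k+1} = \mathbf{f}(\mathbf{s}_k, \mathbf{p}), \tag{2.3}$$

with the properties $\mathbf{s} \in \mathrm{S} \subset \mathbb{R}^N$, $k \in \mathrm{K} \subset \mathbb{Z}$,
$\mathbf{p} \in \mathrm{P} \subset \mathbb{R}^M$, $n = \overline{1, N}$, $m = \overline{1, M}$.

In formula (2.3), $\mathbf{s}$ is a state variable of the system and $\mathbf{p}$ is a vector
of parameters. With mapping (2.3), we associate its trajectory in the space
$\mathrm{S} \times \mathrm{K}$, which has the form of a semisequence
$\{\mathbf{s}_k\}_{k=1}^{K}$, $k = \overline{1, K}$.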


2.1. T-alphabet 

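Define the initial mapping, which encodes (in terms of the finite T-alphabet) the shape
of the n-th component of a sequence:

$$\{\mathbf{s}_{k-1}, \mathbf{s}_k, \mathbf{s}_{k+1}\} \Rightarrow T_k^{\alpha\varphi}|_N = [T_k^{\alpha\varphi}|_1, \ldots, T_k^{\alpha\varphi}|_N]. \tag{2.4}$$

The graphic diagrams illustrating the geometry of the symbols $T_k^{\alpha\varphi}|_n$ for the
k-th sample and the n-th phase variable are shown in Figure 1 [18,19].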


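Figure 1. Geometry of T-alphabet symbols.

Strictly, the mapping (2.4) is defined by the relations:

$$
\begin{array}{ll}
\mathrm{T0}  & \Delta s_- = \Delta s_+ = 0, \\
\mathrm{T1}  & \Delta s_- = \Delta s_+ < 0, \\
\mathrm{T2}  & \Delta s_- = \Delta s_+ > 0, \\
\mathrm{T3N} & \Delta s_- < 0, \quad \Delta s_+ < \Delta s_-, \\
\mathrm{T3P} & \Delta s_- < 0, \quad \Delta s_+ < 0, \quad \Delta s_+ > \Delta s_-, \\
\mathrm{T4N} & \Delta s_- > 0, \quad \Delta s_+ = 0, \\
\mathrm{T4P} & \Delta s_- < 0, \quad \Delta s_+ = 0, \\
\mathrm{T5N} & \Delta s_- > 0, \quad \Delta s_+ > 0, \quad \Delta s_+ < \Delta s_-, \\
\mathrm{T5P} & \Delta s_- > 0, \quad \Delta s_+ > \Delta s_-, \\
\mathrm{T6S} & \Delta s_- > 0, \quad \Delta s_+ < 0, \quad \Delta s_+ > -\Delta s_-, \\
\mathrm{T6}  & \Delta s_- = -\Delta s_+ > 0, \\
\mathrm{T6L} & \Delta s_- > 0, \quad \Delta s_+ < 0, \quad \Delta s_+ < -\Delta s_-, \\
\mathrm{T7S} & \Delta s_- < 0, \quad \Delta s_+ > 0, \quad \Delta s_+ < -\Delta s_-, \\
\mathrm{T7}  & \Delta s_- = -\Delta s_+ < 0, \\
\mathrm{T7L} & \Delta s_- < 0, \quad \Delta s_+ > 0, \quad \Delta s_+ > -\Delta s_-, \\
\mathrm{T8N} & \Delta s_- = 0, \quad \Delta s_+ < 0, \\
\mathrm{T8P} & \Delta s_- = 0, \quad \Delta s_+ > 0,
\end{array}
\tag{2.5}
$$

where $\Delta s_- = s_k^{(n)} - s_{k-1}^{(n)}$ and $\Delta s_+ = s_{k+1}^{(n)} - s_k^{(n)}$.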

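Thus, the T-alphabet includes the following set of symbols:

$$\mathrm{T}_o^{\alpha\varphi} = \{\mathrm{T0}, \mathrm{T1}, \mathrm{T2}, \mathrm{T3N}, \mathrm{T3P}, \mathrm{T4N}, \mathrm{T4P}, \mathrm{T5N}, \mathrm{T5P}, \mathrm{T6S}, \mathrm{T6}, \mathrm{T6L}, \mathrm{T7S}, \mathrm{T7}, \mathrm{T7L}, \mathrm{T8N}, \mathrm{T8P}\}. \tag{2.6}$$

One can see from (2.6) that the symbol $T_k^{\alpha\varphi}|_n$ is encoded as T$i$, where $i$ is
the right-hand side of the symbol codes of the alphabet $\mathrm{T}_o^{\alpha\varphi}$. In turn,
the symbol $T_k^{\alpha\varphi}|_N$ is encoded in terms of T$i_1 \cdots i_N$, see (2.4). The full
alphabet $\mathrm{T}_o^{\alpha\varphi}|_N$, which encodes the shape of the trajectory of the
multidimensional sequence $\{\mathbf{s}_k\}_{k=1}^{K}$, consists of $17^N$ symbols.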
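
The relations (2.5) translate almost directly into a small classifier. The following Python sketch is an illustration under stated assumptions rather than the author's implementation: the function names `encode_t_symbol` and `encode_sequence` are hypothetical, the comparisons are exact (for measured floating-point data some tolerance would probably be needed), and only interior samples are encoded, because each symbol needs both neighbours.

```python
def encode_t_symbol(s_prev, s_cur, s_next):
    """Return the T-alphabet code for one scalar sample, following relations (2.5)."""
    d_minus = s_cur - s_prev   # Delta s_-
    d_plus = s_next - s_cur    # Delta s_+
    if d_minus == 0:
        if d_plus == 0:
            return "T0"
        return "T8N" if d_plus < 0 else "T8P"
    if d_plus == 0:
        return "T4N" if d_minus > 0 else "T4P"
    if d_minus == d_plus:
        return "T1" if d_minus < 0 else "T2"
    if d_minus == -d_plus:
        return "T6" if d_minus > 0 else "T7"
    if d_minus < 0 and d_plus < 0:
        return "T3N" if d_plus < d_minus else "T3P"
    if d_minus > 0 and d_plus > 0:
        return "T5N" if d_plus < d_minus else "T5P"
    if d_minus > 0:            # here d_plus < 0
        return "T6S" if d_plus > -d_minus else "T6L"
    return "T7S" if d_plus < -d_minus else "T7L"   # d_minus < 0 < d_plus


def encode_sequence(series):
    """Encode every interior sample of a scalar series (one phase variable)."""
    return [encode_t_symbol(a, b, c)
            for a, b, c in zip(series, series[1:], series[2:])]


# Made-up integer data, chosen so that exact comparisons are safe.
print(encode_sequence([0, 1, 3, 2, 2, 2, 1]))
# ['T5P', 'T6S', 'T4P', 'T0', 'T8N']
```

For a multidimensional sequence, the same function would be applied to each of the N components separately, which gives the composite code described above.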


2.2. Q-alphabet 


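In addition to the symbols $T_k^{\alpha\varphi}|_n$, we introduce the symbols

$$Q_k^{\alpha\varphi}|_n \equiv T_k^{\alpha\varphi}|_n \to T_{k+1}^{\alpha\varphi}|_n, \qquad Q_k^{\alpha\varphi}|_N = [Q_k^{\alpha\varphi}|_1, \ldots, Q_k^{\alpha\varphi}|_N]. \tag{2.7}$$

All admissible transitions constitute the set of symbols of the alphabet
$\mathrm{Q}_o^{\alpha\varphi} \ni Q_k^{\alpha\varphi}|_n$. These transitions are shown in Figure 2.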

The symbol $Q_k^{\alpha\varphi}|_n$ is encoded as Q$ij$, where $i$ and $j$ are the right-hand sides
of the symbol codes of the alphabet $\mathrm{T}_o^{\alpha\varphi}$ for the states $k$ and $k + 1$,
respectively. In turn, the symbol $Q_k^{\alpha\varphi}|_N$ is encoded in terms of
Q$i_1 j_1 \cdots i_N j_N$, see (2.7). The full alphabet $\mathrm{Q}_o^{\alpha\varphi}|_N$, which
encodes the shape of the trajectory of the sequence $\{\mathbf{s}_k\}_{k=1}^{K}$, consists of
$107^N$ symbols (see Figure 2).

Figure 2. Table of the transitions $T_k^{\alpha\varphi}|_n \to T_{k+1}^{\alpha\varphi}|_n$; admissible transitions are shown in green.
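
The count of 107 admissible transitions per phase variable can be checked with a short enumeration. The sketch below rests on an assumption that is not spelled out in the text: two consecutive T-symbols share one increment (the $\Delta s_+$ of sample $k$ is the $\Delta s_-$ of sample $k+1$), so a transition is taken to be admissible when the signs of these two increments agree. The names `SIGNS`, `is_admissible`, and `q_code` are hypothetical.

```python
# (sign of Delta s_-, sign of Delta s_+) for each T-symbol, read off relations (2.5).
SIGNS = {
    "T0": (0, 0),
    "T1": (-1, -1), "T2": (1, 1),
    "T3N": (-1, -1), "T3P": (-1, -1),
    "T4N": (1, 0), "T4P": (-1, 0),
    "T5N": (1, 1), "T5P": (1, 1),
    "T6S": (1, -1), "T6": (1, -1), "T6L": (1, -1),
    "T7S": (-1, 1), "T7": (-1, 1), "T7L": (-1, 1),
    "T8N": (0, -1), "T8P": (0, 1),
}


def is_admissible(t_cur, t_next):
    """Assumed rule: Delta s_+ of t_cur and Delta s_- of t_next have the same sign."""
    return SIGNS[t_cur][1] == SIGNS[t_next][0]


def q_code(t_cur, t_next):
    """Build the code 'Qij' from the right-hand sides of the two T codes."""
    return "Q" + t_cur[1:] + t_next[1:]


admissible = [q_code(a, b) for a in SIGNS for b in SIGNS if is_admissible(a, b)]
print(len(SIGNS) ** 2, len(admissible))  # 289 107
```

Under this assumption the enumeration gives 3·3 + 7·7 + 7·7 = 107 admissible pairs out of 289, which agrees with the size of the Q-alphabet stated above.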


2.3. Symbolic TQ-image of a dynamical system 

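Let us introduce a finite graph

$$\Gamma^{TQ}|_n = \langle \mathrm{V}^{\Gamma}|_n, \mathrm{E}^{\Gamma}|_n \rangle, \qquad \mathrm{V}^{\Gamma}|_n \subset \mathrm{T}_o^{\alpha\varphi}, \quad \mathrm{E}^{\Gamma}|_n \subset \mathrm{Q}_o^{\alpha\varphi}, \tag{2.8}$$

where $\mathrm{V}^{\Gamma}|_n$ is the vertex set and $\mathrm{E}^{\Gamma}|_n$ is the edge set of
$\Gamma^{TQ}|_n$. According to its topology, the graph $\Gamma^{TQ}|_n$ is a connected directed
graph without multiple arcs but with loops. The graph $\Gamma^{TQ}|_n$ is a particular symbolic
TQ-image of the dynamical system with respect to its n-th phase variable.

The set of graphs

$$\{\Gamma^{TQ}|_n\}_{n=1}^{N} \tag{2.9}$$

is a complete symbolic TQ-image of the dynamical system.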

Let us denote the graph $\Gamma^{TQ}|_n$ corresponding to the full alphabets
$\mathrm{T}_o^{\alpha\varphi}$ and $\mathrm{Q}_o^{\alpha\varphi}$ by $\Gamma_o^{TQ}$.


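The graph (2.8) can be weighted (on its vertices and edges) by the occurrence
frequency of the characters $*$ in the sequence $\{T_k^{\alpha\varphi}|_n\}_{k=1}^{K}$:

$$\Delta^{*}|_n = \frac{\left|\mathrm{M}^{*}|_n\right|}{\left|\bigcup_{*} \mathrm{M}^{*}|_n\right|}, \tag{2.10}$$

where $|\circ|$ is the cardinality of a set and $*$ is the symbol of which the multiset
$\mathrm{M}^{*}|_n$ consists:

$$\Delta^{T}|_n : \mathrm{M}^{*}|_n \ni T_k^{\alpha\varphi}|_n : T_k^{\alpha\varphi}|_n = *, \quad * \in \mathrm{T}_o^{\alpha\varphi}, \tag{2.11a}$$

$$\Delta^{Q}|_n : \mathrm{M}^{*}|_n \ni Q_k^{\alpha\varphi}|_n : Q_k^{\alpha\varphi}|_n = *, \quad * \in \mathrm{Q}_o^{\alpha\varphi}. \tag{2.11b}$$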
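
The weights (2.10) are relative occurrence frequencies, so they can be estimated with a simple counter. The sketch below assumes that the T-symbols of one phase variable are already available as a list of strings; the function name `symbol_frequencies` and the sample data are illustrative only.

```python
from collections import Counter


def symbol_frequencies(symbols):
    """Relative occurrence frequency of each distinct symbol, as in (2.10)."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {sym: cnt / total for sym, cnt in counts.items()}


# Made-up symbol sequence for a single phase variable.
t_symbols = ["T7S", "T5P", "T6L", "T7S", "T5P", "T6L", "T7S"]
# Each Q-symbol is a transition between two consecutive T-symbols.
q_symbols = list(zip(t_symbols, t_symbols[1:]))

vertex_weights = symbol_frequencies(t_symbols)   # weights of graph vertices
edge_weights = symbol_frequencies(q_symbols)     # weights of graph edges
print(vertex_weights["T7S"], edge_weights[("T7S", "T5P")])
```

The vertex weights sum to one over the T-symbols that occur, and the edge weights sum to one over the transitions that occur.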


Note that the calculation of (2.11a) and (2.11b) allows one to quantitatively assess
various properties of the trajectory of the sequence $\{\mathbf{s}_k\}_{k=1}^{K}$ in the space
$\mathrm{S} \times \mathrm{K}$, including the Markov characteristic of the sequence
$\{T_k^{\alpha\varphi}|_n\}_{k=1}^{K}$.

3. Measurement of TQ-complexity


The approach presented here to the calculation of the complexity of multidimensional
discrete mappings and sequences is informally defined by the following statement: the more
complex a dynamic process is, the more complex is the shape of its trajectory in the space
$\mathrm{S} \times \mathrm{K}$. Below, we present this statement in a formalized manner.

First, to each of the symbols $T_k^{\alpha\varphi}|_n$ and $Q_k^{\alpha\varphi}|_n$ we assign a
numerical value of the complexity, the so-called unit complexity of a symbol:
$C^{T}|_n$ and $C^{Q}|_n$.

The symbol $T_k^{\alpha\varphi}|_n$ is composite; therefore, we first determine the unit
complexities of its constituent elementary symbols:


rT =}^ ^“ln = z, 

[2 = U, 

[l S^\n = L, 

C^\n = I 2 = B, . 

[s S^\n = E. 


(3.12a) 


(3.12b) 


The unit complexity of the symbol $T_k^{\alpha\varphi}|_n$ is represented as
$[C^{\alpha}|_n, C^{\varphi}|_n]^{\mathsf{T}}$, where $\mathsf{T}$ is the transpose operator. The
norm of this quantity is defined as $C^{T}|_n = C^{\alpha}|_n + C^{\varphi}|_n - 1$.
The values of the components are given in Table 1.

Table 1. Unit complexities of the symbols (* = N, P; o = S, L).

Symbols             T0    T1, T2    T4*, T8*    T3*, T5*    T6, T7    T6o, T7o
$C^{\alpha}|_n$      1      2          3           4           2         4
$C^{\varphi}|_n$     1      1          2           2           3         3
$C^{T}|_n$           1      2          4           5           4         6


Note that the table is compiled on the following key principle: repeated symbols
(subsequences) do not increase the complexity of the sequence, since they do not carry new
information.


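Define the unit complexity of the symbol $Q_k^{\alpha\varphi}|_n$ in terms of the distance between
$T_k^{\alpha\varphi}|_n$ and $T_{k+1}^{\alpha\varphi}|_n$:

$$C^{Q}|_n = d_T\!\left(T_k^{\alpha\varphi}|_n,\, T_{k+1}^{\alpha\varphi}|_n\right) + 1. \tag{3.13}$$

The measure $d_T(\cdot, \cdot)$ is the number of edges on the shortest path between two
vertices in the graph (see Figure 3).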

In contrast to the earlier paper, we use a scheme of one-point deformation of the
subsequence $\{s_{k-1}, s_k, s_{k+1}\}$ when constructing the graph. This construction is
closer to the classical Levenshtein distance, with the following edit transcript: Replace
and Match. Thus, this modification makes the unit complexity of the symbols
$Q_k^{\alpha\varphi}|_n$ more strictly defined. Moreover, the range of values of $C^{Q}|_n$
becomes balanced with respect to the range of $C^{T}|_n$.
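
Since $d_T$ is an ordinary shortest-path length, it can be computed by breadth-first search. The sketch below is only an illustration: the actual graph of Figure 3 is not reproduced in the text, so the adjacency list `TOY_GRAPH` is a made-up placeholder, and `unit_complexity_q` assumes that (3.13) has the form "distance plus one". All names are hypothetical.

```python
from collections import deque


def shortest_path_length(graph, start, goal):
    """Number of edges on a shortest path from start to goal (breadth-first search)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    raise ValueError("goal is not reachable from start")


def unit_complexity_q(graph, t_cur, t_next):
    """Assumed form of (3.13): shortest-path distance between the two T-symbols plus one."""
    return shortest_path_length(graph, t_cur, t_next) + 1


# Placeholder adjacency list; it does NOT reproduce the graph of Figure 3.
TOY_GRAPH = {
    "T0": ["T4P", "T8P"],
    "T4P": ["T0", "T3P"],
    "T8P": ["T0", "T5P"],
    "T3P": ["T4P"],
    "T5P": ["T8P"],
}

print(unit_complexity_q(TOY_GRAPH, "T3P", "T5P"))  # 5 (path of 4 edges, plus one)
print(unit_complexity_q(TOY_GRAPH, "T0", "T0"))    # 1 (a loop has distance zero)
```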

Figure 3. The graph corresponds to transitions between the symbols $T_k^{\alpha\varphi}|_n$ for the
k-th sample of the subsequence $\{s_{k-1}, s_k, s_{k+1}\}$ under its various continuous
one-point deformations.

As already pointed out in the introduction, the measures of complexity (based on symbolic
CTQ-analysis) proposed earlier by the author ignored the order of symbols. This is a
significant restriction. Indeed, consider two test sequences (the letter T in the notation
of the symbols of the T-alphabet is omitted):

7S 5P 6L 7S 5P 6L 7S 5P 6L 7S 5P 6L 7S 5P 6L 7S ..., (3.14a)

7S 6L 7S 5P 5P 5P 6L 7S 5P 6L 7S 5P 6L 7S 6L 7S .... (3.14b)

Intuitively and logically, the sequence (3.14a) is simpler: it shows clear periodicity and
can easily be continued. The second sequence (3.14b) is objectively more complicated: it is
not so easy to continue, because the principle of its generation is unclear. At the same
time, both sequences (3.14a) and (3.14b) consist of the same set of symbols and differ only
in their order. Thus, to distinguish periodic sequences from random and chaotic ones, we
introduce a reduction procedure.


When calculating the TQ-complexity of the sequence $\{T_k^{\alpha\varphi}|_n\}_{k=1}^{K}$, one
should first reduce it; i.e., one should remove repeated subsequences, because they do not
carry new information. The reduction rule is illustrated in Figure 4.

Figure 4. Illustration of the reduction rule of the sequences $\{T_k^{\alpha\varphi}|_n\}_{k=1}^{K}$
in three cases, A, B, and C (the letter T in the notation of the symbols of the T-alphabet is
omitted); the separator sign denotes the boundary between segments (in fact, it is a
transition $T_k^{\alpha\varphi}|_n \to T_{k+1}^{\alpha\varphi}|_n$, i.e., the symbol
$Q_k^{\alpha\varphi}|_n$). In case B, for example, the sequence
337675555633337555563333755556333376767 is reduced to 3376755633767.


The removal of duplicate fragments is performed starting from longer to shorter ones.
This condition allows one to distinguish between periodic and quasi-periodic segments.
Furthermore, the removal of identical fragments is performed so that, locally (within the
deleted blocks), the set of symbols $T_k^{\alpha\varphi}|_n$ and $Q_k^{\alpha\varphi}|_n$ is
preserved. This guarantees the invariance of the graph $\Gamma^{TQ}|_n$. After the
application of this rule, we obtain a reduced sequence $\{T_k^{\alpha\varphi}|_n\}_{k=1}^{K'}$.
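
The general idea of the reduction, removing repeated fragments from longer to shorter ones, can be sketched in a few lines of Python. This is a deliberately simplified variant and not the rule of Figure 4: it collapses adjacent repeated blocks to a single copy and does not enforce the local preservation of the T- and Q-symbol sets described above. The function name `reduce_sequence` is hypothetical.

```python
def reduce_sequence(symbols):
    """Collapse adjacent repeated blocks to one copy, longest blocks first (simplified)."""
    seq = list(symbols)
    for block in range(len(seq) // 2, 0, -1):
        i = 0
        while i + 2 * block <= len(seq):
            if seq[i:i + block] == seq[i + block:i + 2 * block]:
                # Drop the duplicate and re-test the same position,
                # so that runs of three or more copies also collapse.
                del seq[i + block:i + 2 * block]
            else:
                i += 1
    return seq


# Test sequences (3.14a) and (3.14b), with the letter T omitted as in the text.
periodic = "7S 5P 6L 7S 5P 6L 7S 5P 6L 7S 5P 6L 7S 5P 6L 7S".split()
irregular = "7S 6L 7S 5P 5P 5P 6L 7S 5P 6L 7S 5P 6L 7S 6L 7S".split()

print(reduce_sequence(periodic))   # ['7S', '5P', '6L', '7S']
print(reduce_sequence(irregular))  # ['7S', '6L', '7S', '5P', '6L', '7S']
```

Even this crude variant separates the two test sequences: the periodic one shrinks from 16 to 4 symbols, the irregular one only to 6.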

Let us introduce two measures of complexity. 

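Degenerate measure $C_S^{0} = [C_{ST}^{0}, C_{SQ}^{0}]$. This measure deals solely with the
length of a reduced sequence. Formally, the unit complexities of the symbols
$T_k^{\alpha\varphi}|_n$ and $Q_k^{\alpha\varphi}|_n$ are assumed to be equal to unity:

$$C_{ST}^{0}|_n = K', \qquad C_{SQ}^{0}|_n = K' - 1. \tag{3.15}$$

Weighted measure $C_S^{I} = [C_{ST}^{I}, C_{SQ}^{I}]$. This is an extension of the degenerate
measure: the weighted measure also includes the unit complexities of the symbols
$T_k^{\alpha\varphi}|_n$ and $Q_k^{\alpha\varphi}|_n$:

$$C_{ST}^{I}|_n = \sum_{k=1}^{K'} C^{T}|_n\!\left[T_k^{\alpha\varphi}|_n\right], \qquad C_{SQ}^{I}|_n = \sum_{k=1}^{K'-1} C^{Q}|_n\!\left[T_k^{\alpha\varphi}|_n,\, T_{k+1}^{\alpha\varphi}|_n\right]. \tag{3.16}$$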


On the basis of the complexities (3.15) and (3.16), we can define the measure of the
effective unit complexity of a sequence:

$$C_{S\circ}^{eu}|_n = \frac{C_{S\circ}^{I}|_n}{C_{S\circ}^{0}|_n}, \qquad \circ = T, Q. \tag{3.17}$$


Note that, by design, these measures are directly related to such issues as periodic
orbits, the entropy of a dynamical system, etc. [10]. The central element of this relation
is the spectrum of the reductions, which is described by three quantities: the number of
acts of reduction, the length of a subsequence to be reduced, and the number of removed
fragments. A detailed study of this relation is the subject of our future research.
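
For the T-part, the three measures reduce to a count, a sum, and their ratio. The sketch below is a hedged illustration only: it reads the last row of Table 1 as the unit complexity of a T-symbol, takes two already reduced sequences as given, and leaves out the Q-part, which would require the graph of Figure 3. The names `UNIT_COMPLEXITY_T` and `t_complexity` are hypothetical.

```python
# Unit complexities of T-symbols, taken from the last row of Table 1 (assumed reading).
UNIT_COMPLEXITY_T = {
    "T0": 1,
    "T1": 2, "T2": 2,
    "T4N": 4, "T4P": 4, "T8N": 4, "T8P": 4,
    "T3N": 5, "T3P": 5, "T5N": 5, "T5P": 5,
    "T6": 4, "T7": 4,
    "T6S": 6, "T6L": 6, "T7S": 6, "T7L": 6,
}


def t_complexity(reduced):
    """Return (degenerate, weighted, effective unit) complexity of a reduced T-sequence."""
    degenerate = len(reduced)                               # K' in (3.15)
    weighted = sum(UNIT_COMPLEXITY_T[s] for s in reduced)   # T-part of (3.16)
    return degenerate, weighted, weighted / degenerate      # ratio as in (3.17)


# Two reduced sequences used purely as illustrative input.
simple = ["T7S", "T5P", "T6L", "T7S"]
richer = ["T7S", "T6L", "T7S", "T5P", "T6L", "T7S"]

print(t_complexity(simple))  # (4, 23, 5.75)
print(t_complexity(richer))
```

The degenerate and weighted values grow with the length of the reduced sequence, whereas the effective unit complexity reflects only the average complexity of the symbols that survive the reduction.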


4. Sample 

Let us demonstrate the capabilities of the tools developed by an example of the analysis
of financial time series. The object of analysis is the time series of exchange rates of
several world currencies (US dollar [USD], Euro [EUR], Japanese Yen [JPY], Swiss Franc [CHF],
and British Pound [GBP]) against the Russian ruble. The analyzed period is from 01.01.1999
to 31.12.2014.

Note that the analysis of the TQ-complexity has applied value in the context of
research in macroeconomics and stochastic financial mathematics. The original data are
taken from the official website of the Central Bank of Russia (Bank of Russia, exchange
rates, www.cbr.ru/eng/). The length of each time series is K = 3 985 samples. The initial
time series are shown in Figure 5.

Estimates of the weighted measure of the TQ-complexity for the time series of
currency exchange rates are shown in Figure 6(a). The figure also presents the values of
the measure for the reference stochastic sequence (200 realizations, each 3 985 samples
long). The reference sequence has normally distributed discrete differences, with the
expectation and the variance equal to those of the initial sequences.

Figure 6. (a) Estimates of the weighted measure of the TQ-complexity for the time series of
currency exchange rates. (b) The reduction spectrum of the time series of currency exchange rates.
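
A reference sequence of this kind could be generated as follows. This is a sketch under assumptions that the text does not state: each realization starts from the first observed value, the population variance of the differences is used, and the generator is seeded for reproducibility. The function name `reference_sequence` is hypothetical, and the rates below are made-up numbers, not Bank of Russia data.

```python
import random
import statistics


def reference_sequence(series, rng):
    """Random walk whose increments are normal with the mean and variance of the
    discrete differences of the given series."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    mu = statistics.fmean(diffs)
    sigma = statistics.pstdev(diffs)   # population standard deviation (an assumption)
    out = [series[0]]
    for _ in diffs:
        out.append(out[-1] + rng.gauss(mu, sigma))
    return out


rates = [28.0, 28.3, 28.1, 28.6, 29.0, 28.7, 28.9]  # made-up values for illustration
rng = random.Random(2014)                            # fixed seed, chosen arbitrarily
surrogates = [reference_sequence(rates, rng) for _ in range(200)]
print(len(surrogates), len(surrogates[0]))           # 200 7
```

Each realization has the same length as the original series, so the complexity measures of the data and of the reference ensemble can be compared directly.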


The results of the analysis imply (see Figure 6(a)) that the pair USD/RUB significantly
differs from the other currency pairs in the TQ-complexity of its time series. Moreover, the
complexity of the pair USD/RUB is much lower than the complexity of the reference stochastic
sequence.

From this we can draw two preliminary conclusions: (i) the dynamics of the formation
of the USD/RUB exchange rate significantly differs from that of the other pairs (perhaps
even at the level of financial and economic mechanisms); (ii) the time series of the
USD/RUB pair is easier to predict. In principle, these findings are in good agreement with,
and complement, the previous results of the author.

In addition, consider the reduction spectrum, which is shown in Figure 6(b). The figure
shows that predominantly T-subsequences with a length of 1 and 2 samples are reduced.
However, the JPY/RUB pair contains a single fragment with a length of 6 samples. It
should be noted that each T-symbol comprises 3 consecutive samples of the initial sequence.
Information about the reduction spectrum may also be useful for the analysis of the
short-term predictability of currency exchange rates.

5. Conclusion 

In this paper, we have proposed a new approach to the quantitative evaluation of the
complexity of multidimensional chaotic sequences that is based on the study of the
structural properties of sequences (in terms of the shape of their trajectories in the space
$\mathrm{S} \times \mathrm{K}$). This approach is free from most of the disadvantages of
existing methods for estimating the complexity of dynamic processes. The algorithm is based
on the method of symbolic CTQ-analysis. It takes into account not only the occurrence
frequency of symbols but also the order in which the symbols appear.